A theory of preferential linking



There are diverse mechanisms driving the evolution of social networks. A key open question in understanding their evolution is: how do various preferential linking mechanisms produce networks with different features? In this paper we first empirically study preferential linking phenomena in an evolving online social network, and find and validate linear preference. We propose an analyzable model which captures the real growth process of the network and reveals the underlying mechanism dominating its evolution. Furthermore, based on preferential linking, we propose a generalized model reproducing the evolution of online social networks, present unified analytical results describing network characteristics for 27 preference scenarios, and explore the relation between preferential linking mechanisms and network features. We find that within the framework of preferential linking, analytical degree distributions can only be combinations of a finite number of kinds of functions, which are related to rational, logarithmic and inverse tangent functions, and that extremely complex network structure will emerge even for very simple sublinear preferential linking. This work not only provides a verifiable origin for the emergence of various network characteristics in online social networks, but also bridges the micro behaviors of users and the global organization of online social networks.

1. INTRODUCTION 

In real life not everyone is equally popular, and in social networks not everyone possesses the same status or position. Moreno (1934) and Jennings (1943) discovered that some individuals tend to be at the center of social networks while others remain on the periphery. This realization gave rise to the concept of network centrality (Borgatti and Everett 2006). Centrality has important effects on the evolution of social networks. Degree centrality, i.e. the number of ties that an actor possesses, has received particular attention, perhaps due to its computational simplicity. In many real-world social networks, researchers have found that most actors have only a few ties, while a small number have extraordinarily many. For instance, Liljeros et al. (2001) found that the degree distribution is highly skewed in sexual contact networks, where some super-connector actors acquire as many as 1000 partners. Similar patterns also exist in the movie co-appearance network and in numerous co-authorship networks in academia (Albert and Barabasi 2002).

In the past few years, Web 2.0, which is characterized by social collaborative technologies such as social networking sites (SNSs), blogs, wikis, video or photo sharing and folksonomy, has attracted much attention from researchers in diverse disciplines (Lazer et al. 2009). As a fast-growing business, many SNSs of different scopes and purposes have emerged on the Web (boyd and Ellison 2007), many of which, such as Facebook (Lewis et al. 2008), Renren (Zhao et al. 2012), MySpace (Ahn et al. 2007; Wilkinson and Thelwall 2010), Orkut (Ahn et al. 2007; Mislove et al. 2007) and the newborn Google+ (Gong et al. 2012), are among the most popular sites on the Web. Users of these sites, by establishing friendship relations with other users, can form online social networks (OSNs). As in real-world social networks, individual degrees in OSNs also show obvious heterogeneity. An analysis of the 721 million users on Facebook found that a few individuals have 5000 friends (a limit imposed by Facebook), more than 26 times as many as the average user's 190 (Ugander et al. 2011).

One important reason social networks develop such a high variance in actors' degrees is that the number of ties an actor possesses affects processes of attachment. Social connections tend to accrue to those who already have them, the consequence of which is that small differences in actor degree compound over time into a distinct cumulative advantage (DiPrete and Eirich 2006; Rivera et al. 2010). In OSNs the creation of links between individual users has been studied in a number of contexts (Opsahl and Hogan 2010; Traud et al. 2012), and is believed to be driven by the principle of preferential attachment (PA), i.e. new users prefer to connect to old users with higher degree. PA is widely recognized as the principal driving force behind the evolution of many growing networks. Moreover, the PA hypothesis stands as the accepted explanation for the prevalence of scale-free organization in diverse evolving networks.

The extent to which PA works has been studied, qualitatively or quantitatively, in real-world social networks and OSNs. However, most of these studies are empirical and lack analyzable models. Moreover, in network evolution, when new users establish friend relationships with old users, or new ties are established between old users, the old users with large degrees are likely to be preferentially selected in both cases. Yet most previous studies either focus only on PA or combine the two cases into one, overlooking the possibility that the strength of preference differs for link establishment under different scenarios. To date, there are few analytical studies that bridge the micro preferential linking (PL, which considers link establishment not only between old users and new users but also between old users) and the macrostructure of OSNs. A key open question in understanding the evolution of OSNs is: how will the combination of linear PL, sublinear PL and randomized attachment generate networks with different characteristics? In this paper we explore not only how linear PL leads to networks with the scale-free feature (which has been partly studied in the past), but also what network features will result from diverse PL mechanisms, which has not been previously studied.

In the remainder of this paper, after an overview of PA in social networks, we present a detailed case study based on a real network dataset, following the procedure of network measurement, modeling, analysis, and model validation. We bring forward an analyzable model, which can reproduce the process of network growth and connect the PL mechanism and the network characteristics. Furthermore, considering different forms of PL, we propose a generalized model for the evolution of OSNs, and present analytical results characterizing network features for diverse preference scenarios. From the perspective of sociology and economics we analyze the reasons why PL exists in OSNs. Finally, we discuss the limitations of the paper and present a research framework for better understanding the evolution of OSNs.



2. RELATED WORK 

Many social networks have a measured degree distribution $P(k)$ that is either a power law $P(k) \propto k^{-\gamma}$, or a power law with an exponential cutoff. Growing models have been proposed to account for these features, most of them being based on some form of PA. Generally, PA means that when new nodes join the network and link to the existing nodes, the probability of linking to node $i$ is an increasing function of the degree $k_i$ of $i$. Some models assume this function to be linear (Barabasi and Albert 1999), while in other cases it has been assumed to depend on a different power of $k_i$ (Krapivsky et al. 2001). In general, the probability $\Pi(k_i)$ with which an edge belonging to a new node connects to an existing node $i$ of degree $k_i$ is $\Pi(k_i) \propto k_i^{\beta}$, where $\beta > 0$. For $\beta = 1$ the rate is linear and the model reduces to the familiar BA model, which yields a power-law degree distribution with $\gamma = 3$ (Barabasi and Albert 1999). For $\beta < 1$ the PA is sublinear and $P(k)$ is a stretched exponential, $P(k) \propto k^{-\beta} \exp[-(b(\beta)/(1-\beta)) k^{1-\beta}]$, where $b(\beta)$ is a constant depending on $\beta$ (Krapivsky et al. 2001). The absence of PA is attained in the limit $\beta = 0$, when the attachment rule is independent of degree. The resulting degree distribution in this case is $P(k) \propto \exp(-k/m)$, where $m$ is a constant. For $\beta > 1$ a single node gets almost all the edges, with the rest having an exponential distribution of degrees. Therefore, to know which kind of PA, if any, is at work in a particular growing network, one needs to study empirically networks for which the times at which new nodes entered the network and new edges formed are known.

In recent years some empirical studies have verified the existence of a PA rule for social networks, both real-world and online, and the exponent $\beta$ has also been estimated for several networks. However, there are some differences in the functional form of $\Pi(k_i)$. In some cases it appears to be quite close to linear, while in other cases it has been found to be sublinear.

For real-world social networks, Newman (2001) studied scientific collaboration networks and found that researchers in physics and biology who already had a large number of collaborators are more likely to accumulate new collaborators in the future. By fitting the data he obtained $\beta = 1.04$ for Medline and $\beta = 0.89$ for the Los Alamos Archive. Jeong et al. (2003) explored the co-authorship network in the neuroscience field and the Hollywood co-cast actor network, and found $\beta = 0.79$ for the co-authorship network and $\beta = 0.81$ for the co-cast actor network, implying sublinear PA. Peltomaki and Alava (2006) studied growing collaboration networks from the IMDB and the arXiv.org preprint server, and found that the measured value of the exponent is $\beta \approx 0.65$ for the actor network, $\beta \approx 0.6$ for the astrophysics network, and $\beta \approx 0.75$ for the condensed matter physics and high energy physics networks. de Blasio et al. (2007) tested the PA conjecture in sexual contact networks based on Norwegian survey data, and found evidence of nonrandom, sublinear PA.

Recently, owing to the availability of data on evolving OSNs, though they may be low-resolution or only a sample covering a period of time, the PA mechanism has also been validated in OSNs. Mislove et al. (2008) studied the evolution of Flickr and found that users tend to create and receive links in proportion to their outdegree and indegree, respectively. Leskovec et al. (2008) studied the evolution of Flickr, del.icio.us, Yahoo! Answers and LinkedIn, and examined whether PA holds for these networks. They found that Flickr and del.icio.us show linear preference, and Yahoo! Answers shows slightly sublinear preference, $\beta = 0.9$. For LinkedIn, $\beta = 0.6$ for low degrees; however, for large degrees $\beta = 1.2$, indicating superlinear preference. Garg et al. (2009) analyzed an evolving online social aggregator, FriendFeed, and found that $\beta = 0.8$ for source node selection and $\beta = 0.9$ for destination node selection. Szell and Thurner (2010) studied a massive multiplayer online game, Pardus. They measured the indegrees of characters who are marked by newcomers as friend (enemy) and found $\beta \approx 0.62$ for friend markings with $k_{in} < 30$, and $\beta \approx 0.90$ for all enemy markings. Aiello et al. (2010) investigated the dynamical properties of aNobii and tested the PA mechanism. They obtained linear behavior, both when considering the indegree and the outdegree for $k$. Rocha et al. (2010) studied the sexual networks of Internet-mediated prostitution extracted from a forum-like Brazilian Web community and found that sex buyers exhibit sublinear PA for both short and long intervals. They also observed close to linear PA for sex sellers for short time intervals, whereas longer time intervals are associated with sublinear PA. This means that feedback processes are stronger for shorter than for longer timescales. Moreover, Zhao et al. (2012) studied the evolution of Renren, the largest OSN in China, and found that $\beta$ is not constant over time: $\beta(t)$ decreases as the network grows, which indicates that the influence of PA on network evolution weakens with the growth of Renren.

From the previous theoretical and empirical studies we find that although the basic idea of PA is already well established, the relation between the combination of various PL mechanisms and the resulting network features has not been fully explored; exploring it is the primary goal of this paper.

[Figure 1: time-ordered <From, To, When> records, each labeled Invite or Accept.]

Fig. 1. Data format and evolution of OSN Wealink.

3. CASE STUDY 

3.1. Dataset 

Uncovering how the micro-mechanisms of network growth lead to the macrostructure of OSNs is of paramount importance in understanding the evolution of OSNs; however, data privacy policies make it difficult for researchers to obtain the data of evolving OSNs (Zhao et al. 2012). It is therefore very difficult to capture the process of network evolution, because detailed empirical data on network growth, with time labels covering both the joining of new users and the establishment of new friend relationships, are still scarce.

In this section we first study Wealink, a large LinkedIn-like SNS whose users are mostly professionals, typically businessmen and office clerks. The network data, logged from 0:00:00 h on 11 May 2005 (the inception day of the Web 2.0 site) to 15:23:42 h on 22 August 2007, include all friend relationships and the time of formation of each tie.

The final data format, as shown in Fig. 1, is a time-ordered list of triples <U_i, U_j, T_k> indicating that at time T_k user U_i sends a link request to user U_j, or accepts U_j's previous friendship request and they become friends. As in Facebook and Renren, a friend relation is established only when the sent invitation is accepted. The online community is a dynamically evolving one, with new users joining the network and new ties established between users.
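
The record format described above can be turned into a list of established ties with a few lines of code. The following is a minimal sketch rather than the processing actually used for Wealink: the function name `build_friendships` and the toy records are hypothetical, and it assumes that an acceptance can be recognized as a record whose sender and receiver reverse an earlier, still pending invitation.

```python
def build_friendships(records):
    """Turn time-ordered (sender, receiver, time) triples into friendships.

    Assumption (illustrative only): a record is an acceptance exactly when
    the receiver has a pending invitation addressed to the sender.
    """
    pending = set()     # (inviter, invitee) pairs still awaiting acceptance
    friendships = []    # (inviter, invitee, time_of_acceptance)
    for sender, receiver, when in records:
        if (receiver, sender) in pending:
            # The sender accepts an earlier invitation from the receiver.
            pending.discard((receiver, sender))
            friendships.append((receiver, sender, when))
        else:
            # Otherwise the record is a new invitation.
            pending.add((sender, receiver))
    return friendships


# Toy data: U1 invites U2 and U3, U2 accepts, U4's invitation stays pending.
toy_records = [("U1", "U2", 1), ("U1", "U3", 2), ("U2", "U1", 3), ("U4", "U2", 4)]
toy_ties = build_friendships(toy_records)
```

Only accepted invitations yield ties, which mirrors the rule that a friend relation exists only after acceptance.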

3.2. Preferential Linking 

Like some other OSNs, the degree distribution of Wealink shows a power-law feature. This kind of distribution can be produced through linear PA, as revealed by the BA model. In addition to the dynamics due to new users joining the network (generally by creating a new account) and making friends with old users, there is also the dynamics that results from active users interacting with each other. In a real scenario of network growth, when new users establish friend relationships with old users, or new ties are established between old users, the old users with large degrees are likely to be preferentially selected in both cases. In this subsection we give evidence supporting these hypotheses.

Since ties in many OSNs are the consequence of bilateral decisions of a pair of users, not of unilateral decisions, to test the preference feature for different types of link establishment we separate PL into three aspects: preferential acceptance, preferential creation, and PA. Preferential acceptance implies that the larger an old user's degree is, the more likely she/he will be selected as a friend by other old users. Preferential creation implies that the larger an old user's degree is, the more likely her/his link invitations will be accepted by other old users. The meaning of PA remains unchanged, i.e. new users tend to attach to already popular old users with large degrees.

[Figure 2. Legend: + preferential creation, O preferential acceptance, • preferential attachment.]

Fig. 2. Preference characteristics in the evolution of Wealink.

Let $k_i$ be the degree of user $i$. The probability that user $i$ with degree $k_i$ is chosen can be expressed as

$$\Pi(k_i) = \frac{k_i^{\beta}}{\sum_j k_j^{\beta}}. \qquad (1)$$

We can compute the probability $\Pi(k)$ that an old user of degree $k$ is chosen, normalized by the number of users of degree $k$ that exist just before this step:

$$\Pi(k) = \frac{\sum_t [e_t = v \wedge k_v(t-1) = k]}{\sum_t |\{u : k_u(t-1) = k\}|} \propto k^{\beta}, \qquad (2)$$

where $e_t = v \wedge k_v(t-1) = k$ represents that at time $t$ the old user whose degree is $k$ at time $t-1$ is chosen. We use $[\cdot]$ to denote a predicate (which takes the value 1 if the expression is true, and 0 otherwise). Generally, $\Pi(k)$ has significant fluctuations, particularly for large $k$. To reduce the noise level, instead of $\Pi(k)$ we study the cumulative function

$$\kappa(k) = \int_0^k \Pi(k)\,dk \propto k^{\beta+1}. \qquad (3)$$

Fig. 2 shows the relation between the degree $k$ of users and the preference metric $\kappa$. Least squares linear regression gives $\alpha = 1.93 \pm 0.01$ ($R^2 = 0.99$) for preferential creation, $\alpha = 1.97 \pm 0.01$ ($R^2 = 0.98$) for PA and $\alpha = 2.06 \pm 0.01$ ($R^2 = 0.99$) for preferential acceptance, where $\alpha = \beta + 1$ is the fitted exponent of $\kappa(k)$. All fits have significance level $p < 2.2 \times 10^{-16}$. Thus $\beta \approx 1$, indicating linear preference.
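
The measurement in Eqs. (2) and (3) can be illustrated with a short estimator. This is a sketch under stated assumptions, not the procedure used for Fig. 2: the function name `estimate_beta` and its two input dictionaries are hypothetical, the integral in Eq. (3) is replaced by a running sum over observed degrees, and the exponent is taken from an ordinary least squares fit of $\ln \kappa$ against $\ln k$.

```python
import math


def estimate_beta(chosen, exposure):
    """Estimate the preference exponent beta from selection counts.

    chosen[k]   : how many times a user of degree k was selected
    exposure[k] : how many (user, step) pairs had degree k just before a step
    Both dictionaries are hypothetical inputs for this illustration.
    """
    log_k, log_kappa = [], []
    kappa = 0.0
    for k in sorted(d for d in exposure if exposure[d] > 0):
        kappa += chosen.get(k, 0) / exposure[k]   # running sum of Pi(k), cf. Eq. (3)
        if kappa > 0:
            log_k.append(math.log(k))
            log_kappa.append(math.log(kappa))
    n = len(log_k)
    mean_x = sum(log_k) / n
    mean_y = sum(log_kappa) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_k, log_kappa))
             / sum((x - mean_x) ** 2 for x in log_k))
    return slope - 1.0   # kappa grows like k^(beta + 1)


# Synthetic check: selection counts exactly proportional to degree.
synthetic_chosen = {k: k for k in range(1, 1001)}
synthetic_exposure = {k: 1 for k in range(1, 1001)}
beta_hat = estimate_beta(synthetic_chosen, synthetic_exposure)
```

On this synthetic input the estimate lies close to 1; the small deviation comes from replacing the integral by a discrete sum at low degrees.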

3.3. Model 

Like other OSNs, the evolution of Wealink includes two processes. In the first, a new user joins the network and establishes a friend relation with an old user already present in the network. In the second, a friend relation is established between two old users. There also exist cases in which a tie forms between two new users; however, this situation is rare in the real world and can be neglected.

Based on the linear preference we bring forward the following network model. Starting with a small connected network with $m_0$ users, at every time step there are two alternatives:

A. With probability $p$, we add a new user with one edge that will be connected to a user already present in the network. The probability that the new user will be connected to old user $i$ with degree $k_i$ is $\Pi(k_i) = k_i / \sum_j k_j$.

B. With probability $q = 1 - p$, we add one new edge connecting two old users. The two endpoints of the edge are also chosen according to linear preference.
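
The two alternatives above can be sketched as a short simulation. This is a minimal illustration, not the authors' code: the function name `simulate_linear_pl`, the path-shaped seed network, the rejection of self-loops in alternative B and the tolerance of repeated edges are all assumptions made here. Linear preference is implemented by drawing uniformly from a list that contains each user once per incident edge.

```python
import random


def simulate_linear_pl(steps, p, m0=3, seed=0):
    """Grow a network by alternatives A and B and return the degree list."""
    rng = random.Random(seed)
    degree = [0] * m0
    stubs = []                       # each user appears once per incident edge
    for i in range(m0 - 1):          # assumed seed network: a path on m0 users
        degree[i] += 1
        degree[i + 1] += 1
        stubs += [i, i + 1]
    for _ in range(steps):
        if rng.random() < p:
            # A: a new user attaches to an old user chosen with prob. k_i / sum_j k_j.
            target = rng.choice(stubs)
            new_user = len(degree)
            degree.append(1)
            degree[target] += 1
            stubs += [new_user, target]
        else:
            # B: a new edge between two old users, both chosen by linear preference.
            first = rng.choice(stubs)
            second = rng.choice(stubs)
            while second == first:   # assumption: self-loops are rejected
                second = rng.choice(stubs)
            degree[first] += 1
            degree[second] += 1
            stubs += [first, second]
    return degree


degrees = simulate_linear_pl(steps=2000, p=0.8, m0=3, seed=1)
```

Every step adds exactly one edge, so the total degree grows by 2 per step, in line with the approximation that the total degree is about $2t$ for large $t$.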

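After $t$ time steps the model leads to a network with mean number of users $N(t) = m_0 + pt$. For large $t$, $N \approx pt$ and the total degree of the network is $k_{all}(t) \approx 2t$. Applying the mean-field approach for user $i$, we obtain

$$\frac{dk_i}{dt} = p\frac{k_i}{2t} + 2q\frac{k_i}{2t} = \frac{p + 2q}{2t} k_i. \qquad (4)$$

Since $q = 1 - p$, we have $p + 2q = 2 - p$, and the solution of Eq. (4) with the initial condition $k_i(t_i) = 1$ is

$$k_i(t) = (t/t_i)^{(2-p)/2}. \qquad (5)$$

Thus

$$P(k_i(t) < k) = P\left(t_i > k^{-2/(2-p)}\, t\right). \qquad (6)$$

The probability density of $t_i$ for large $t$ is

$$P_i(t_i) = 1/(m_0 + tp) \approx 1/(tp). \qquad (7)$$

From Eq. (6) we obtain

$$P(k_i(t) < k) = 1 - P\left(t_i \le k^{-2/(2-p)}\, t\right) = 1 - p^{-1} k^{-2/(2-p)}. \qquad (8)$$

Thus the probability density $P(k)$ is

$$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} \propto k^{-\frac{4-p}{2-p}}. \qquad (9)$$

The exponent $\gamma = (4-p)/(2-p) \in (2, 3]$, and when $p = 1$ the model reduces to the BA model.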

According to the empirical data, we obtain $p = 0.7941$ and $q = 0.1939$. The links created between two new users are few and can thus be neglected. Based on these parameters and Eq. (9), we obtain $P(k) \propto k^{-2.67}$. Fig. 3 shows the numerical result, which is obtained by averaging over 10 independent realizations with $p = 0.7941$ and the same number of users as Wealink. Its degree exponent 2.62 agrees well with the predicted value of 2.67. Fig. 3 also presents the complementary cumulative degree distribution of Wealink. We fit the network data with a power-law model using the maximum likelihood estimation method and obtain $\gamma = 2.91$. The predicted degree exponent 2.67 of the model is in reasonable agreement with the real value 2.91. We also compute the p-value for the estimated power-law fit to the network using the Kolmogorov-Smirnov test and obtain $p = 0.704$ (Clauset et al. 2009). Since this exceeds the chosen threshold of 0.1, the power law is a good match to the degree distribution of Wealink.
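
As a worked check of the predicted exponent, the sketch below evaluates $\gamma = (4-p)/(2-p)$ from Eq. (9). The function name `predicted_exponent` is hypothetical, and the renormalization of $p$ by $p + q$ is an assumption made here: the measured fractions sum to slightly less than 1 because edges between two new users are neglected, and renormalizing is one reading that reproduces the reported value of 2.67.

```python
def predicted_exponent(p, q):
    """Degree exponent (4 - p) / (2 - p) after renormalizing p by p + q.

    The renormalization step is an assumption of this sketch; with
    p + q = 1 it has no effect.
    """
    p_norm = p / (p + q)
    return (4.0 - p_norm) / (2.0 - p_norm)


gamma_wealink = predicted_exponent(0.7941, 0.1939)
gamma_ba = predicted_exponent(1.0, 0.0)   # p = 1 corresponds to the BA limit
```

With $p = 1$ the function returns the BA value of 3, consistent with the range $(2, 3]$ of the exponent.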

In the real world, unlike in the ideal model, the probability $p$ is not stationary during the evolution of OSNs. In some stages $p$ can be very large while in others it can be very small, which can lead to a difference between the real exponent and the predicted one. Fig. 4 shows the evolution of $p$ and $q$ and demonstrates this fact. As a guide we also indicate the positions of $p = 0.7941$ and $q = 0.1939$.

Fig. 3. The complementary cumulative degree distributions of Wealink and the networks obtained by numerical simulations. (Plot annotation: Slope = -1.62.)

Fig. 4. Evolution of the fraction of two kinds of edges. The dashed line indicates p = 0.7941 and the dotted line q = 0.1939.

4. GENERAL MODEL

In OSNs new users are constantly joining the social networks and creating edges towards already present users. Very few users leave the network, and very few edges disappear between users who remain in the network. Edges are also created between already present users. Moreover, in the evolution of real OSNs, new users or edges are added into networks one by one, and previous empirical studies have also shown that in OSNs most preference exponents satisfy $\beta < 1$. Thus we bring forward the following general network model. Still starting with a small connected network with $m_0$ users, at every time step there are again two alternatives:

A. With probability $p$, we add a new user with one edge that will be connected to a user already present in the network. The probability that the new user will be connected to old user $a$ with degree $k_a$ is $\Pi(k_a) = k_a^{\alpha} / \sum_j k_j^{\alpha}$, where $0 \le \alpha \le 1$.

B. With probability $q = 1 - p$, we add one new edge connecting two old users. One endpoint $b$ is chosen according to $\Pi(k_b) = k_b^{\beta} / \sum_j k_j^{\beta}$, while the other endpoint $c$ is chosen according to $\Pi(k_c) = k_c^{\gamma} / \sum_j k_j^{\gamma}$, where $0 \le \beta, \gamma \le 1$.
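
One time step of the general model can be sketched as follows. This is an illustrative sketch only: the names `pick_by_preference` and `general_model_step` are hypothetical, the selection recomputes all weights at every draw (simple but linear in the number of users), and the sketch does not exclude the case in which both endpoints in alternative B coincide, which is an assumption since the text does not specify it.

```python
import random


def pick_by_preference(degrees, exponent, rng):
    """Return the index of a user chosen with probability k^exponent / sum_j k_j^exponent."""
    weights = [k ** exponent for k in degrees]
    return rng.choices(range(len(degrees)), weights=weights, k=1)[0]


def general_model_step(degrees, p, alpha, beta, gamma, rng):
    """Apply alternative A (probability p) or alternative B (probability 1 - p) in place."""
    if rng.random() < p:
        a = pick_by_preference(degrees, alpha, rng)
        degrees[a] += 1
        degrees.append(1)            # the new user arrives with one edge
    else:
        b = pick_by_preference(degrees, beta, rng)
        c = pick_by_preference(degrees, gamma, rng)
        degrees[b] += 1
        degrees[c] += 1


rng = random.Random(1)
toy_degrees = [1, 1]                 # assumed seed: two connected users
for _ in range(500):
    general_model_step(toy_degrees, p=0.8, alpha=0.2, beta=0.5, gamma=1.0, rng=rng)
```

Setting an exponent to 0 gives random attachment and setting it to 1 gives linear PL, so the same step covers all three rules.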



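Thus

$$\frac{dk_i}{dt} = p\frac{k_i^{\alpha}}{\sum_j k_j^{\alpha}} + q\frac{k_i^{\beta}}{\sum_j k_j^{\beta}} + q\frac{k_i^{\gamma}}{\sum_j k_j^{\gamma}}, \qquad (10)$$

where $0 < p, q < 1$. According to

$$\sum_j k_j = 2t, \qquad (11)$$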
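
Table I. The evolution of $k_i$ and corresponding $P(k)$ when only linear PL or random attachment exists

Case | a      | b      | c      | dk_i/dt                           | P(k)
I    | Linear | Linear | Linear | $\frac{(p+2q)k_i}{2t}$            | $\propto k^{-\frac{4-p}{2-p}}$
II   | Linear | Linear | Random | $\frac{k_i}{2t} + \frac{q}{pt}$   | $\propto (kp + 2q)^{-3}$
     | Linear | Random | Linear |                                   |
III  | Linear | Random | Random | $\frac{pk_i}{2t} + \frac{2q}{pt}$ | $\propto (kp^2 + 4q)^{-(\frac{2}{p}+1)}$
IV   | Random | Linear | Linear | $\frac{1}{t} + \frac{qk_i}{t}$    | $\propto (kq + 1)^{-(\frac{1}{q}+1)}$
V    | Random | Linear | Random | $\frac{1}{pt} + \frac{qk_i}{2t}$  | $\propto (kpq + 2)^{-(\frac{2}{q}+1)}$
     | Random | Random | Linear |                                   |
VI   | Random | Random | Random | $\frac{p+2q}{pt}$                 | $\propto e^{-\frac{pk}{p+2q}}$
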
when $0 < \alpha < 1$, $\sum_j k_j^{\alpha} = ut$, where $p < u < 2$.

As users a, b and c can each be chosen according to any one of three rules (random attachment, linear PL and sublinear PL), there are 27 different scenarios for the evolution of OSNs.

First we consider the situations where only linear PL or random attachment exists, i.e. $\alpha, \beta, \gamma = 1$ or $0$. There are eight such scenarios in total, which can be divided into six cases. Using an approach similar to that in Sec. 3, we obtain all their degree distributions, which are summarized in Tab. I. It is not surprising that for case I linear PL results in a power-law distribution, and for case VI random attachment leads to an exponential distribution. It is interesting, however, that for the other cases the combination of a linear PL component and a randomized attachment component also generates networks with approximately power-law distributions. Moreover, according to the range of variation of the degree exponent in Tab. I, the introduction of randomized attachment can evidently enhance the homogeneity of network structure.

When sublinear PL exists, there are 19 different scenarios for the evolution of $k_i$, which can be divided into 12 cases and are shown in Tab. II. According to the Lipschitz condition, there are unique solutions for $k_i$.
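
The scenario counts quoted above (27 scenarios, of which 8 involve only linear PL or random attachment and 19 involve sublinear PL, grouped into 6 and 12 cases) can be checked by direct enumeration. The sketch below is illustrative: the function name `classify_scenarios` is hypothetical, and it assumes that two scenarios belong to the same case when they differ only by swapping the rules of the two endpoints b and c of an edge between old users.

```python
from itertools import product

RULES = ("random", "linear", "sublinear")


def classify_scenarios():
    """Count scenarios and cases for the rules applied to users a, b and c."""
    scenarios = list(product(RULES, repeat=3))      # (rule for a, rule for b, rule for c)
    without_sub = [s for s in scenarios if "sublinear" not in s]
    with_sub = [s for s in scenarios if "sublinear" in s]

    def case_key(scenario):
        a, b, c = scenario
        return (a, tuple(sorted((b, c))))           # b and c are interchangeable

    return {
        "scenarios": len(scenarios),
        "linear_or_random_scenarios": len(without_sub),
        "linear_or_random_cases": len({case_key(s) for s in without_sub}),
        "sublinear_scenarios": len(with_sub),
        "sublinear_cases": len({case_key(s) for s in with_sub}),
    }


scenario_counts = classify_scenarios()
```

Under this grouping assumption the enumeration reproduces the counts stated in the text.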

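For case I we obtain

$$\frac{dk_i}{dt} = \frac{q k_i}{t} + \frac{p k_i^{\alpha}}{ut}, \qquad (12)$$

which is Bernoulli's differential equation. Let $z = k_i^{1-\alpha}$, thus

$$\frac{dz}{dt} - \frac{q(1-\alpha)}{t} z = \frac{p(1-\alpha)}{ut}. \qquad (13)$$

Therefore

$$z = e^{\int \frac{q(1-\alpha)}{t} dt} \left[ \int \frac{p(1-\alpha)}{ut}\, e^{-\int \frac{q(1-\alpha)}{t} dt}\, dt + c \right] = c_1 t^{q(1-\alpha)} - \frac{p}{uq}, \qquad (14)$$

where $c$ and $c_1$ are constants. Thus

$$k_i^{1-\alpha} = c_1 t^{q(1-\alpha)} - \frac{p}{uq}. \qquad (15)$$

According to the initial value $k_i(t_i) = 1$, we obtain

$$k_i = \left[ \left(1 + \frac{p}{uq}\right) \left(\frac{t}{t_i}\right)^{q(1-\alpha)} - \frac{p}{uq} \right]^{\frac{1}{1-\alpha}}. \qquad (16)$$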
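
The closed form in Eq. (16) can be checked numerically against the differential equation of case I, $dk_i/dt = q k_i / t + p k_i^{\alpha} / (ut)$. The sketch below is illustrative: the function names `k_closed_form` and `k_runge_kutta`, the parameter values and the step count are hypothetical, and $u$ is treated as a constant, as in the mean-field derivation.

```python
def k_closed_form(t, t_i, p, q, u, alpha):
    """Degree at time t of a user who joined at t_i with degree 1, from Eq. (16)."""
    r = p / (u * q)
    return ((1.0 + r) * (t / t_i) ** (q * (1.0 - alpha)) - r) ** (1.0 / (1.0 - alpha))


def k_runge_kutta(t, t_i, p, q, u, alpha, steps=20000):
    """Integrate dk/dt = q*k/t + p*k^alpha/(u*t) with the classical RK4 scheme."""
    def rate(s, k):
        return q * k / s + p * k ** alpha / (u * s)

    h = (t - t_i) / steps
    s, k = t_i, 1.0
    for _ in range(steps):
        k1 = rate(s, k)
        k2 = rate(s + h / 2, k + h * k1 / 2)
        k3 = rate(s + h / 2, k + h * k2 / 2)
        k4 = rate(s + h, k + h * k3)
        k += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return k


params = dict(t=100.0, t_i=1.0, p=0.8, q=0.2, u=1.5, alpha=0.5)
analytic = k_closed_form(**params)
numeric = k_runge_kutta(**params)
```

Agreement between the two values supports the algebra of the Bernoulli substitution; it says nothing about the accuracy of the mean-field approximation itself.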
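
Table II. The evolution of $k_i$ when sublinear PL exists, with $0 < \alpha, \beta, \gamma < 1$ and $p < u, v, w < 2$

Case | a         | b         | c         | dk_i/dt
I    | Sublinear | Linear    | Linear    | $\frac{pk_i^{\alpha}}{ut} + \frac{qk_i}{t}$
II   | Linear    | Sublinear | Linear    | $\frac{qk_i^{\beta}}{vt} + \frac{k_i}{2t}$
     | Linear    | Linear    | Sublinear |
III  | Sublinear | Sublinear | Sublinear | $\frac{pk_i^{\alpha}}{ut} + \frac{qk_i^{\beta}}{vt} + \frac{qk_i^{\gamma}}{wt}$
IV   | Random    | Sublinear | Sublinear | $\frac{1}{t} + \frac{qk_i^{\beta}}{vt} + \frac{qk_i^{\gamma}}{wt}$
V    | Sublinear | Sublinear | Random    | $\frac{pk_i^{\alpha}}{ut} + \frac{qk_i^{\beta}}{vt} + \frac{q}{pt}$
     | Sublinear | Random    | Sublinear |
VI   | Linear    | Sublinear | Sublinear | $\frac{pk_i}{2t} + \frac{qk_i^{\beta}}{vt} + \frac{qk_i^{\gamma}}{wt}$
VII  | Sublinear | Sublinear | Linear    | $\frac{pk_i^{\alpha}}{ut} + \frac{qk_i^{\beta}}{vt} + \frac{qk_i}{2t}$
     | Sublinear | Linear    | Sublinear |
VIII | Random    | Sublinear | Random    | $\frac{1}{t} + \frac{qk_i^{\beta}}{vt} + \frac{q}{pt}$
     | Random    | Random    | Sublinear |
IX   | Sublinear | Random    | Random    | $\frac{pk_i^{\alpha}}{ut} + \frac{2q}{pt}$
X    | Linear    | Sublinear | Random    | $\frac{pk_i}{2t} + \frac{qk_i^{\beta}}{vt} + \frac{q}{pt}$
     | Linear    | Random    | Sublinear |
XI   | Random    | Sublinear | Linear    | $\frac{1}{t} + \frac{qk_i^{\beta}}{vt} + \frac{qk_i}{2t}$
     | Random    | Linear    | Sublinear |
XII  | Sublinear | Linear    | Random    | $\frac{pk_i^{\alpha}}{ut} + \frac{q}{pt} + \frac{qk_i}{2t}$
     | Sublinear | Random    | Linear    |
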

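Accordingly

$$P(k) \propto \left(uq k^{1-\alpha} + p\right)^{-\left[\frac{1}{q(1-\alpha)} + 1\right]}, \qquad (17)$$

and for large $k$, $P(k) \propto k^{-\left(\frac{1}{q} + 1 - \alpha\right)}$.

Similarly for case II we obtain

$$P(k) \propto \left(\frac{1}{2} v k^{1-\beta} + q\right)^{-\left(\frac{2}{1-\beta} + 1\right)}, \qquad (18)$$

and for large $k$, $P(k) \propto k^{-(3-\beta)}$.

For case III when $\alpha = \beta = \gamma$

$$\frac{dk_i}{dt} = \frac{(1+q) k_i^{\alpha}}{ut}, \qquad (19)$$

thus

$$k_i^{1-\alpha} = 1 + \frac{(1-\alpha)(1+q)}{u} \ln\frac{t}{t_i}. \qquad (20)$$

Accordingly

$$P(k) \propto k^{-\alpha} \exp\left[ -\frac{u k^{1-\alpha}}{(1-\alpha)(1+q)} \right], \qquad (21)$$

which is a stretched exponential distribution. Empirical research has shown that for Facebook networks in different institutions the stretched exponential gives the best fit in comparison with power-law and log-normal distributions (Traud et al. 2012).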
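For case VI when $\beta = \gamma$, we have

$$\frac{dk_i}{dt} = \frac{p k_i}{2t} + \frac{2q k_i^{\beta}}{vt}. \qquad (22)$$

According to the derivation in case I, we obtain

$$P(k) \propto \left(\frac{pv}{2} k^{1-\beta} + 2q\right)^{-\left[\frac{2}{p(1-\beta)} + 1\right]}, \qquad (23)$$

and for large $k$, $P(k) \propto k^{-\left(\frac{2}{p} + 1 - \beta\right)}$.

For case VII when $\alpha = \beta$, we have

$$\frac{dk_i}{dt} = \frac{k_i^{\alpha}}{ut} + \frac{q k_i}{2t}. \qquad (24)$$

Similarly we obtain

$$P(k) \propto \left(\frac{uq}{2} k^{1-\alpha} + 1\right)^{-\left[\frac{2}{q(1-\alpha)} + 1\right]}, \qquad (25)$$

and for large $k$, $P(k) \propto k^{-\left(\frac{2}{q} + 1 - \alpha\right)}$.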

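Using the common approaches, including the mean-field, rate equation and master equation approaches, we cannot obtain analytical solutions for all 27 scenarios. We notice that Eq. (10) can be expressed as

$$\frac{dk_i}{dt} = \left(a k_i^{\alpha} + b k_i^{\beta} + c k_i^{\gamma}\right) \frac{1}{t}, \qquad (26)$$

where $a$, $b$ and $c$ are constants. Namely

$$\int_{k_i(t_i)}^{k_i(t)} \frac{dk_i}{a k_i^{\alpha} + b k_i^{\beta} + c k_i^{\gamma}} = \ln\frac{t}{t_i}. \qquad (27)$$

The complementary cumulative degree distribution of networks can be written as

$$P_c(k) \propto \exp\left(-\int_1^k \frac{dk}{a k^{\alpha} + b k^{\beta} + c k^{\gamma}}\right). \qquad (28)$$

Let $n_1$, $n_2$ and $n_3$ be non-negative integers, $m$ be a positive integer, and $\alpha = n_1/m$, $\beta = n_2/m$ and $\gamma = n_3/m$. Further let $s = k^{1/m}$, then

$$\int_1^k \frac{dk}{a k^{\alpha} + b k^{\beta} + c k^{\gamma}} = \int_1^{k^{1/m}} \frac{m s^{m-1}\, ds}{a s^{n_1} + b s^{n_2} + c s^{n_3}}. \qquad (29)$$

Suppose that $n_1 \ge n_2 \ge n_3$ and let

$$\frac{m s^{m-1}}{a s^{n_1} + b s^{n_2} + c s^{n_3}} = \frac{m s^{m-1-n_3}}{a s^{n_1-n_3} + b s^{n_2-n_3} + c} = R(s) + \frac{P(s)}{Q(s)}, \qquad (30)$$

where $R(s)$ and $P(s)$ are polynomials with $\deg P < \deg Q$. Furthermore suppose that the polynomial $Q(s)$ has $l$ distinct complex conjugate pairs of roots $\eta_1 \pm i\mu_1, \ldots, \eta_l \pm i\mu_l$ and $k$ distinct real roots $\lambda_1, \ldots, \lambda_k$, then, up to a constant factor, we have

$$Q(s) = \prod_{i=1}^{l} \left[(s - \eta_i)^2 + \mu_i^2\right]^{m_i} \prod_{i=1}^{k} (s - \lambda_i)^{n_i}, \qquad (31)$$

where $m_i$ and $n_i$ denote the multiplicities of the roots. For $P(s)/Q(s)$ there exist real constants $A_{ij}$, $B_{ij}$ and $C_{ij}$ such that

$$\frac{P(s)}{Q(s)} = \sum_{i=1}^{l} \sum_{j=1}^{m_i} \frac{A_{ij} + B_{ij} s}{\left[(s - \eta_i)^2 + \mu_i^2\right]^j} + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \frac{C_{ij}}{(s - \lambda_i)^j}. \qquad (32)$$

The second term of the right-hand side of Eq. (32) can easily be integrated. For the first term, when $j = 1$ we have

$$\int \frac{A + Bs}{(s - \eta)^2 + \mu^2}\, ds = \frac{B}{2} \ln\left[(s - \eta)^2 + \mu^2\right] + \frac{A + B\eta}{\mu} \arctan\frac{s - \eta}{\mu}, \qquad (33)$$

and when $j > 1$

$$\int \frac{A + Bs}{\left[(s - \eta)^2 + \mu^2\right]^j}\, ds = \frac{-B}{2(j-1)\left[(s - \eta)^2 + \mu^2\right]^{j-1}} + \frac{A + B\eta}{\mu^{2j-1}}\, J_j\!\left(\frac{s - \eta}{\mu}\right), \qquad (34)$$

where $J_1(z) = \arctan z$ and

$$J_{j+1}(z) = \frac{2j - 1}{2j} J_j(z) + \frac{z}{2j(z^2 + 1)^j}. \qquad (35)$$

Thus according to Eqs. (30)-(35), the primitive function of Eq. (30) can only be the sum of rational functions, logarithmic functions and inverse tangent functions, and for all scenarios in the generalized model we can analytically obtain their degree distributions, though the expressions can be complex in most scenarios.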
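
When the closed-form expression is unwieldy, Eq. (28) can also be evaluated numerically. The following is a minimal sketch, not part of the original analysis: the function name `ccdf_unnormalized` is hypothetical, the integral is computed by Simpson's rule after the substitution $y = \ln x$, and the constants $a$, $b$, $c$ must be supplied by the caller (for instance $a = p/2$ and $b = c = q/2$ in the fully linear scenario, where $\sum_j k_j = 2t$).

```python
import math


def ccdf_unnormalized(k, a, b, c, alpha, beta, gamma, n=2000):
    """Evaluate exp(-integral_1^k dx / (a x^alpha + b x^beta + c x^gamma)).

    Uses Simpson's rule with n (even) subintervals in the variable y = ln x,
    so that dx = x dy. The result is the complementary cumulative degree
    distribution of Eq. (28) up to a constant factor.
    """
    if k <= 1:
        return 1.0
    upper = math.log(k)
    h = upper / n

    def integrand(y):
        x = math.exp(y)
        return x / (a * x ** alpha + b * x ** beta + c * x ** gamma)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return math.exp(-total * h / 3.0)


# Fully linear scenario with p = 0.8, q = 0.2: expect k^(-2 / (2 - p)).
linear_value = ccdf_unnormalized(100.0, 0.4, 0.1, 0.1, 1.0, 1.0, 1.0)
```

In the fully linear scenario the integrand is constant in $y$, so the numerical result coincides with the power law $k^{-2/(2-p)}$ implied by Eq. (8).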

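In cases III-XII in Tab. II, for some special values of $\alpha$, $\beta$ or $\gamma$, we can easily obtain the solutions for $P_c(k)$. For example, in case VIII, when $\beta = 1/3$,

$$P_c(k) \propto \exp\left(-\int_1^{k^{1/3}} \frac{3pv s^2}{pqs + v}\, ds\right) \propto \left(pq k^{1/3} + v\right)^{-\frac{3v^3}{p^2 q^3}} \exp\left(-\frac{3v k^{2/3}}{2q} + \frac{3v^2 k^{1/3}}{p q^2}\right). \qquad (36)$$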



Although quite controversial, online friendship is thought to be vitally important for the well-being and social capital of people (Dunbar 2012; Ellison et al. 2007; Valenzuela et al. 2009; Valkenburg and Peter 2007). We use the Gini coefficient to quantify the inequality of the degrees of users (Stirling 2007). Fig. 5 shows the numerical results, which are obtained by averaging over 20 independent realizations. For Eq. (10) with $\alpha = 0.2$, the corresponding numerical result for $0 \le \beta, \gamma \le 1$ is shown in Fig. 5(a). As expected, a symmetrical pattern emerges along the minor diagonal. With $\gamma = 0.2$, the corresponding numerical result for $0 \le \alpha, \beta \le 1$ is shown in Fig. 5(b). The numerical simulations include all cases in Tab. II. It is evident that a larger preference exponent results in greater inequality of the degrees of users and the emergence of hubs, and thus a larger Gini coefficient. Moreover, we find that from randomized attachment to PL there is a clear jump in network heterogeneity, which implies that PL can significantly enhance the inequality of individual social capital.
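
For reference, the Gini coefficient of a degree sequence can be computed as in the sketch below. The function name `gini` is hypothetical, and the rank-based formula used here is one standard form of the coefficient; the text does not state which estimator was used for Fig. 5.

```python
def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula.

    G = 2 * sum_i(i * x_(i)) / (n * sum_i x_(i)) - (n + 1) / n,
    where x_(1) <= ... <= x_(n) are the sorted values.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


equal_degrees = gini([1, 1, 1, 1])      # perfectly equal degrees
single_hub = gini([0, 0, 0, 1])         # all ties held by one user
```

The coefficient is 0 when all users have the same degree and approaches 1 as ties concentrate on a few hubs.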

5. WHY PREFERENTIAL LINKING 

Why do people prefer to attach their links to others who have more links? Obviously, in real life we make friends with someone not because she/he has many friends but because she/he possesses some quality we expect and is also willing to make friends with us.

Fig. 5. Gini coefficients of networks obtained by numerical simulations with p = 0.8 and N = 10*. 

Thus a large degree indicates that the actor is a worthy and trustworthy person, and that making friends with her/him will benefit us. Many studies have found a positive association between an actor's degree and that actor's goal achievement, including creativity, job attainment, professional advancement, political influence and prestige. Thus a user's degree is a stand-in for her/his true fitness, since direct performance data are costly to gather before the relationship is made. PL purportedly occurs because actors looking for new connections use an actor's degree as a proxy for her/his fitness. A profile owner with many friends will be judged as more popular than a profile owner with few friends (Utz 2010).

Kim and Jo (2010) proposed several interesting models and explained PA as rational equilibrium behavior. In fact, people are not certain of the value that they can obtain from forming a link with someone. A person has an incentive to form a link with another who has many links because the number of her/his links can convey some information about her/his value; in an economic sense, the number of links can be a signal of the value of the person, i.e. the observable degree contains some information about her/his unobservable value. From the perspective of economics, if the return obtained by interacting with someone is greater than the cost, we like and are willing to continue to maintain this relationship, especially when the benefit of this relationship outweighs that of other possible relationships. The users with large degrees are precisely the persons from whom we can expect to get more profit.

PL is widely used as an evolution mechanism of networks. However, it is hard to
believe that any individual can obtain global information and shape the network
architecture based on it. Li et al. (2010) found that global PA can emerge from
local interaction models, including the distance-dependent PA evolving model,
the acquaintance network model and the connecting nearest-neighbor model. In fact,
Aiello et al. (2010) have found that many users join aNobii by creating links to pairs
of already connected users.
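
The emergence of degree-biased linking from purely local choices can be illustrated with a minimal simulation sketch. It is not the model of Li et al. (2010) or any other cited model; it is a simplified growth rule assumed here for illustration only, in which each newcomer links to a uniformly chosen existing user and to one uniformly chosen friend of that user, so that no global degree information is used. The function name grow_local and all parameter values are hypothetical.

```python
import random


def grow_local(n_nodes, seed=0):
    """Grow a network using local information only (illustrative assumption).

    Each newcomer links to a uniformly random existing node (the target)
    and to one uniformly random neighbor of that target (the friend).
    Returns the adjacency lists together with the degrees of the chosen
    targets and friends, recorded at the moment of each choice.
    """
    rng = random.Random(seed)
    adj = {0: [1], 1: [0]}  # seed network: two nodes joined by one edge
    target_deg, friend_deg = [], []
    for new in range(2, n_nodes):
        target = rng.randrange(new)        # uniform choice, no degree knowledge
        friend = rng.choice(adj[target])   # local step: a neighbor of the target
        target_deg.append(len(adj[target]))
        friend_deg.append(len(adj[friend]))
        adj[new] = [target, friend]        # newcomer closes a triangle
        adj[target].append(new)
        adj[friend].append(new)
    return adj, target_deg, friend_deg


adj, target_deg, friend_deg = grow_local(5000)
degrees = [len(nbrs) for nbrs in adj.values()]
print("mean degree:", sum(degrees) / len(degrees))
print("max degree:", max(degrees))
print("mean degree of uniformly chosen targets:", sum(target_deg) / len(target_deg))
print("mean degree of their random friends:", sum(friend_deg) / len(friend_deg))
```

In this sketch the friend is reached by following a link, so users with many links are reached more often than users with few, even though no one consults a global degree ranking. Comparing the two printed averages gives a rough sense of how strongly the local step favors high-degree users under these assumptions.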

6. DISCUSSION 

As shown in Fig. 4, the probabilities p and q are time-variant and cannot be
stationary during the real evolution of OSNs. Moreover, the activity of users can weaken
over time (Zhao et al. 2012). There exists a memory kernel which dominates the
decline of users' activity and might be highly skewed, for example obeying a power law
(Cattuto et al. 2007). Thus a more realistic model would let p, q, α, β and γ in Eq.
(10) all be time-dependent.

Why do two people become friends? This question has been widely and intensively
studied in social psychology. Besides PL, there are diverse mechanisms which can lead
to the formation of dyadic ties, such as homophily, relational or propinquity
mechanisms and physical attractiveness. They are intimately interwoven in the
evolution of real social networks and have been found at work in the formation of OSNs
(Rivera et al. 2010; Garg et al. 2009). For example, homophily has been found in
Facebook (Wimmer and Lewis 2010), Microsoft Messenger (Leskovec and Horvitz 2008),
LiveJournal (Lauw et al. 2010), aNobii (Aiello et al. 2010), MySpace (Thelwall 2009)
and online dating sites (Fiore and Donath 2005; Skopek et al. 2011). For the relational
mechanism, the connecting nearest-neighbor model has been proposed as an explanation
(Vazquez 2003), and empirical research has shown that this mechanism
is at work in aNobii (Aiello et al. 2010). Moreover, although the Internet
transcends some of the limitations of physical space, proximity still matters in OSNs
(Liben-Nowell et al. 2005; Amichai-Hamburger et al. 2013), especially for online
dating, in which a face-to-face relationship is the goal. It is also clear that
beauty matters and is pleasing to people. For online dating or heterosexual
exchanges, good appearance is an important factor that users can offer and strongly
seek (Fiore et al. 2008). PL can account for the degree distribution of OSNs;
however, it cannot explain the other structural or sociological characteristics of the
networks. A deeper understanding of these mechanisms can allow us to better
model and predict the structure and dynamics of OSNs (Liben-Nowell and Kleinberg 2007;
Aiello et al. 2012). Krivitsky et al. (2009) made an effort towards this goal. They
proposed a latent cluster random effects model to represent degree distributions,
clustering, and homophily in social networks, though the model is essentially
statistical rather than growing (Toivonen et al. 2009).

Most conclusions of this article are theoretical and need to be validated with
empirical network datasets. Because SNSs serve diverse purposes, disparate
mechanisms may dominate the formation and evolution of different OSNs. In OSNs
for general users, existing users may tend to associate with others similar to themselves,
and homophily can dominate. In OSNs for professionals, by contrast, existing users may
prefer to associate with celebrities in the same vocation, because personal success in
an occupation may benefit from communication with them. Moreover, the relative
importance of different mechanisms also differs across the growth stages of OSNs. In
the beginning stage users may tend to establish friendship relations with users
who are their friends in real life, while in the later stage users may prefer to make
friends with users whom they do not know in real life but in whom they are interested,
which can result in a transition from degree assortativity to disassortativity.
Considering the diversity of users and the fact that network growth mechanisms tend to be
correlated with each other, under such multidimensional diversity and complexity we
can only simulate or reproduce one or several of the network characteristics.
Incorporating more social psychological and economic viewpoints and approaches into the
modeling of OSNs would help to better understand the formation of dyadic
ties. This is a possible future research direction, though the analyses would be
much more complex in that setting.